Finite-Sample Analysis of Lasso-TD

Authors

  • Mohammad Ghavamzadeh
  • Alessandro Lazaric
  • Rémi Munos
  • Matthew W. Hoffman
Abstract

In this paper, we analyze the performance of Lasso-TD, a modification of LSTD in which the projection operator is defined as a Lasso problem. We first show that Lasso-TD is guaranteed to have a unique fixed point and its algorithmic implementation coincides with the recently presented LARS-TD and LC-TD methods. We then derive two bounds on the prediction error of Lasso-TD in the Markov design setting, i.e., when the performance is evaluated on the same states used by the method. The first bound makes no assumption, but has a slow rate w.r.t. the number of samples. The second bound is under an assumption on the empirical Gram matrix, called the compatibility condition, but has an improved rate and directly relates the prediction error to the sparsity of the value function in the feature space at hand. For the full version of this work, please refer to Ghavamzadeh et al. (2011).
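The fixed-point idea in the abstract can be illustrated with a small numerical sketch. This is not the authors' algorithm: the abstract says the implementation coincides with LARS-TD and LC-TD, whereas the code below substitutes a naive fixed-point iteration with coordinate descent as the inner Lasso solver. The function names (`soft_threshold`, `lasso_cd`, `lasso_td`), the scaling of the objective as (1/n)·squared error + `lam`·ℓ1 norm, and the choice to give the last state of the trajectory a zero successor are all assumptions made for illustration.

```python
import numpy as np


def soft_threshold(x, t):
    """Shrink the scalar x toward zero by t (proximal map of t*|x|)."""
    return np.sign(x) * max(abs(x) - t, 0.0)


def lasso_cd(Phi, y, lam, n_sweeps=50):
    """Approximately minimise (1/n)*||y - Phi a||^2 + lam*||a||_1
    by cyclic coordinate descent, starting from a = 0."""
    n, d = Phi.shape
    a = np.zeros(d)
    col_sq = (Phi ** 2).sum(axis=0) / n  # per-feature empirical second moment
    for _ in range(n_sweeps):
        for j in range(d):
            if col_sq[j] == 0.0:
                continue  # an all-zero feature keeps a zero weight
            # Residual with feature j's current contribution added back.
            partial_resid = y - Phi @ a + Phi[:, j] * a[j]
            rho = Phi[:, j] @ partial_resid / n
            a[j] = soft_threshold(rho, lam / 2.0) / col_sq[j]
    return a


def lasso_td(Phi, r, gamma, lam, n_outer=200):
    """Iterate a <- Lasso fit of the empirical Bellman target r + gamma*Phi_next a.

    Phi holds the features of one trajectory, one row per visited state.
    Assumption of this sketch: the successor features are Phi shifted by one
    row, and the final state gets a zero successor.
    """
    n, d = Phi.shape
    Phi_next = np.vstack([Phi[1:], np.zeros((1, d))])
    a = np.zeros(d)
    for _ in range(n_outer):
        target = r + gamma * (Phi_next @ a)
        a = lasso_cd(Phi, target, lam)
    return a


# Synthetic data only; these numbers do not come from the paper.
rng = np.random.default_rng(0)
Phi = rng.normal(size=(50, 5))
r = rng.normal(size=50)
alpha = lasso_td(Phi, r, gamma=0.9, lam=0.1)
print(alpha)
```

With a very large `lam` the returned weights are all zero, and as `lam` shrinks toward zero the inner step approaches an ordinary least-squares fit, so the iteration targets an LSTD-style fixed point on the sampled states. The evaluation on the same rows of `Phi` used for fitting mirrors the Markov design setting described in the abstract.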

Related resources

Time-Discontinuous Finite Element Analysis of Two-Dimensional Elastodynamic Problems using Complex Fourier Shape Functions

This paper reformulates a time-discontinuous finite element method (TD-FEM) based on a new class of shape functions, called complex Fourier hereafter, for solving two-dimensional elastodynamic problems. These shape functions, which are derived from their corresponding radial basis functions, have some advantages such as the satisfaction of exponential and trigonometric function fields in comple...

Termination Analysis by Learning Terminating Programs

We present a novel approach to termination analysis. In a first step, the analysis uses a program as a black-box which exhibits only a finite set of sample traces. Each sample trace is infinite but can be represented by a finite lasso. The analysis can "learn" a program from a termination proof for the lasso, a program that is terminating by construction. In a second step, the analysis checks t...

Finite Sample Analysis for TD(0) with Linear Function Approximation

TD(0) is one of the most commonly used algorithms in reinforcement learning. Despite this, there is no existing finite sample analysis for TD(0) with function approximation, even for the linear case. Our work is the first to provide such a result. Works that managed to obtain concentration bounds for online Temporal Difference (TD) methods analyzed modified versions of them, carefully crafted f...

Finite-Sample Analysis of Proximal Gradient TD Algorithms

In this paper, we show for the first time how gradient TD (GTD) reinforcement learning methods can be formally derived as true stochastic gradient algorithms, not with respect to their original objective functions as previously attempted, but rather using derived primal-dual saddle-point objective functions. We then conduct a saddle-point error analysis to obtain finite-sample bounds on their p...

Estimation Consistency of the Group Lasso and its Applications

We extend the ℓ2-consistency result of (Meinshausen and Yu 2008) from the Lasso to the group Lasso. Our main theorem shows that the group Lasso achieves estimation consistency under a mild condition and an asymptotic upper bound on the number of selected variables can be obtained. As a result, we can apply the nonnegative garrote procedure to the group Lasso result to obtain an estimator which ...

Publication date: 2011